## Replication data and code for: "Labeling Social Media Posts: Does Showing Coders Multimodal Content Produce Better Human Annotation, and a Better Machine Classifier?"

Authors: Haohan Chen, James Bisbee, Joshua A. Tucker, Jonathan Nagler

---

- Please run `Replication_Code_Fig_Tab.Rmd` to replicate the key figures and tables in the paper and the appendices.
- The `Figures_Tables` folder contains the figures and tables generated by the replication code.
- The file `Replication_Code_Fig_Tab.html` is the log file generated by the replication code, including all code, figures, and tables.
- Please use the table of contents in the log file to navigate the code, figures, and tables.
- The `Data` folder contains the data used in the paper.
- `Code_FitModels` contains the code that fits the machine learning classifiers used to generate the results in the paper. This code cannot be run directly because it requires the text of the Twitter posts, which we have removed for privacy reasons. Readers can use `tweet_id_str` to download the text of the tweets through Twitter's API, provided the tweets have not been removed by their authors or the platform.
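
As an alternative to knitting in an IDE, the replication file can be rendered from the R console. The following is a minimal sketch, assuming the working directory is the root of this replication package and the `rmarkdown` package is installed. The helper name `render_replication` is hypothetical and is not part of the replication code.

```r
# Minimal sketch: render the replication notebook from the R console.
# Assumption: the working directory is the root of the replication package.
# `render_replication` is a hypothetical helper, not part of the package.
render_replication <- function(rmd = "Replication_Code_Fig_Tab.Rmd") {
  if (!file.exists(rmd)) {
    stop("Cannot find ", rmd, "; set the working directory to the package root.")
  }
  if (!requireNamespace("rmarkdown", quietly = TRUE)) {
    stop("Package 'rmarkdown' is required; install it with install.packages('rmarkdown').")
  }
  # Uses the output format declared in the file's YAML header and
  # returns the path of the rendered output file.
  rmarkdown::render(rmd)
}

# Usage (uncomment to run):
# render_replication()
```

The existence check runs first so that a wrong working directory produces a clear error message before any rendering starts.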

---

- Machine information
  - System: macOS 15.3.1
  - RAM: 64 GB
- Software information
  - R version 4.4.1 (2024-06-14)
  - Packages used (please see the `sessionInfo()` output below for the full list of packages and their versions)


```r
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS 15.3.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xtable_1.8-4    irr_0.84.1      lpSolve_5.6.20  lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1  
 [7] dplyr_1.1.4     purrr_1.0.2     readr_2.1.5     tidyr_1.3.1     tibble_3.2.1    ggplot2_3.5.1  
[13] tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4        generics_0.1.3    stringi_1.8.4     hms_1.1.3         digest_0.6.36    
 [6] magrittr_2.0.3    evaluate_0.24.0   grid_4.4.1        timechange_0.3.0  fastmap_1.2.0    
[11] fansi_1.0.6       scales_1.3.0      textshaping_0.4.0 cli_3.6.3         rlang_1.1.4      
[16] crayon_1.5.3      bit64_4.0.5       munsell_0.5.1     withr_3.0.0       yaml_2.3.10      
[21] tools_4.4.1       parallel_4.4.1    tzdb_0.4.0        colorspace_2.1-1  vctrs_0.6.5      
[26] R6_2.5.1          lifecycle_1.0.4   bit_4.0.5         vroom_1.6.5       ragg_1.3.2       
[31] pkgconfig_2.0.3   pillar_1.9.0      gtable_0.3.5      glue_1.7.0        systemfonts_1.1.0
[36] xfun_0.46         tidyselect_1.2.1  rstudioapi_0.16.0 knitr_1.48        farver_2.1.2     
[41] htmltools_0.5.8.1 rmarkdown_2.27    labeling_0.4.3    compiler_4.4.1   
```
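
Before running the replication code, it may help to compare locally installed package versions with the attached packages listed in the `sessionInfo()` output above. The following is a minimal sketch: the version strings are copied from that output, while the helper name `check_versions` is hypothetical and not part of the replication code. A version mismatch does not necessarily prevent replication; the table only flags differences.

```r
# Versions of the attached packages, copied from the sessionInfo() output above.
expected <- c(
  xtable = "1.8-4", irr = "0.84.1", lpSolve = "5.6.20", lubridate = "1.9.3",
  forcats = "1.0.0", stringr = "1.5.1", dplyr = "1.1.4", purrr = "1.0.2",
  readr = "2.1.5", tidyr = "1.3.1", tibble = "3.2.1", ggplot2 = "3.5.1",
  tidyverse = "2.0.0"
)

# Hypothetical helper: compare installed versions against a named character
# vector of expected versions (names are package names).
check_versions <- function(expected) {
  rows <- lapply(names(expected), function(pkg) {
    # system.file() returns "" when the package is not installed.
    has_pkg <- nzchar(system.file(package = pkg))
    installed <- if (has_pkg) utils::packageVersion(pkg) else NA
    data.frame(
      package = pkg,
      expected = expected[[pkg]],
      installed = if (has_pkg) as.character(installed) else NA_character_,
      # Compare as version objects so that "1.8-4" and "1.8.4" are treated alike.
      match = has_pkg && installed == package_version(expected[[pkg]])
    )
  })
  do.call(rbind, rows)
}

print(check_versions(expected))
```

Packages that are not installed appear with `NA` in the `installed` column and `FALSE` in the `match` column.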